METHOD FOR THE IDENTIFICATION OF A METABOLIC PATHWAY
FAMILY BY MEANS OF POSITIVE SELECTION



Field of the invention

The invention relates to the direct selection of metabolic pathways having a determined
function in the transformation of a substrate {A} into a target product {B}, which is of
interest in the industrial, chemical, pharmaceutical, cosmetic, agrochemical or
nutraceutical field. More specifically, the invention relates to the detection, within
metagenomic libraries, of novel biosynthesis pathways involved in a biochemical
reaction which leads to the product {B}. By the selection and characterisation of said
novel metabolic pathways enabling {B} to be produced enzymatically, the invention
both provides an alternative to the chemical synthesis of the molecule in question {B}
and enables the synthesis of a product {B} which until now was inaccessible by
chemical means.


Background of the invention 

Biocatalysis, defined as the enzymatic synthesis of molecules of interest, has become
increasingly popular because it offers a strong alternative to chemical synthesis in
terms of cost, time, purification steps, and simplicity of use. The introduction of any
new biocatalysis process on an industrial scale nevertheless requires (i) identifying the
enzyme (or enzymes) which make(s) it possible to specifically convert the substrate
provided into the desired product, and (ii) identifying the enzyme (or enzymes) which
make(s) it possible to implement the catalysis in a stable manner and in the particular
conditions linked to the industrial process (thermostability, pH, tolerance to the
denaturing conditions of organic solvents, etc.).

Due to their universal distribution, including in the most extreme environments, 
microorganisms are known for being able to perform totally original enzymatic 
functions and in conditions compatible with the industrial processes mentioned above. 


However, the promising approach of exploiting these bacterial functions has always
been considerably limited by a technological obstacle: the isolation and in vitro culture
of the bacteria that make up this enormous diversity. Most bacteria developing in
complex natural environments (soils and sediments, aquatic environments, digestive
systems, ...) have not been cultivated because their optimal culturing conditions are
unknown or too difficult to reproduce. Numerous scientific works demonstrate this
established fact, and it is now widely accepted that only between 0.1 and 1% of the
bacterial diversity, all environments included, has been isolated and cultivated
(Amann et al., 1995, Microbiol. Rev., 59:143-169). Even if the search for novel
biocatalytic pathways within collections of microbial strains has proved to be effective,
it nevertheless has the major disadvantage of exploiting only a tiny part of the bacterial
biodiversity.

New approaches have been developed in order to overcome this critical point of
isolating bacteria and to gain access to the enormous genetic potential offered by the
adaptation systems which bacteria have developed over their long evolution. These
approaches are grouped under the name metagenomics because they relate to the set of
genomes of a bacterial community, without any distinction (the metagenome).

Metagenomics involves the direct extraction of DNA from environmental samples, and
its propagation and expression in cultivatable bacterial hosts. Metagenomics in the
strict sense was first used for identifying new bacterial phyla (Pace, 1997, Science,
276:734-740). This approach is based upon the specific cloning of genes recognised for
their phylogenetic interest, such as for example 16S rDNA. Other developments have
been implemented in order to identify new enzymes of environmental or industrial
interest (Terragen Diversity, Patent N° US 6,441,148). In these two approaches,
metagenomics starts with a selection of the desired genes. This selection is made by a
PCR (Polymerase Chain Reaction) approach, generally before the cloning step. In the
latter case, the cloning vector is preferably an expression vector (i.e. it contains
regulatory sequences upstream of the cloned DNA fragment, enabling the cloned DNA
to be expressed in a given expression host).

More recent developments consider the metagenome as a whole. Thus, no selection and
no identification is made before the metagenomic DNA library is created, in a totally
random fashion. This approach therefore gives access to the whole genetic potential of
the bacterial community being explored, without any a priori assumption.

In general, bacteria play an important role in the functioning of ecosystems, in which
they are well represented quantitatively. For example, it is estimated that one gram of
soil can contain between 1,000 and 10,000 different species of bacteria and between
10^7 and 10^9 cells, counting both cultivatable and non-cultivatable bacteria.
Reproducing this whole diversity in metagenomic DNA libraries requires the ability to
generate and manage a large number of clones.

In this latter approach, the DNA libraries are made up of several tens of thousands,
hundreds of thousands, or even several million recombinant clones which differ from
one another by the DNA which they have incorporated. The average size of the cloned
metagenomic inserts is therefore of the utmost importance in the search for bacterial
biosynthesis pathways, because most of the time these pathways are organised in
clusters in bacteria. The larger the cloned fragments of DNA (larger than 30 kb), the
smaller the number of clones to be analysed and the greater the possibility of
recovering complete metabolic pathways which make it possible to convert a
substrate {A} into a target product {B} and into a source of growth.
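
The trade-off between insert size and library size can be illustrated with the classical
Clarke-Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to
genome size and P the desired probability of recovering a given sequence. This formula
is not cited in this document; it is used here only as an assumed, simplified model for a
single genome, and the genome and insert sizes below are hypothetical values chosen
for illustration.

```python
import math


def clones_needed(genome_size_bp: float, insert_size_bp: float,
                  probability: float = 0.99) -> int:
    """Clarke-Carbon estimate of the number of clones needed so that any
    given sequence is present in the library with the given probability.

    Assumption: a single genome cloned as random, equally sized inserts.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    if not 0 < insert_size_bp < genome_size_bp:
        raise ValueError("insert size must be positive and smaller than the genome")
    fraction = insert_size_bp / genome_size_bp  # share of the genome per clone
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


# Hypothetical 4 Mb genome, small versus large inserts
for insert in (4_000, 40_000):
    print(insert, clones_needed(4_000_000, insert))
```

Under this simplified model, moving from 4 kb to 40 kb inserts reduces the number of
clones required roughly tenfold for the same probability of recovery.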

Given the large number of recombinant clones to be studied and the number of trials to
be carried out, numerous laboratories tend to use high density hybridisation
systems (high density membranes or DNA chips), in particular for the characterisation
of bacterial communities (for a review, see Zhou et al., 2003, Curr. Opin. Microbiol.,
6:288-294).

Even if none of these data relate to metagenomic libraries, they nevertheless provide a
great deal of information, such as the quantification of different functional genes (Cho
et al., 2003), the study of functional genes and their diversity (Wu et al., 2001, Appl.
Environ. Microbiol., 67:5780-5790) and the direct detection of 16S rDNA genes (Small
et al., 2001). Just one study relates to the use of metagenomics in combination with
DNA chips (Sebat et al., 2003, Appl. Environ. Microbiol., 69:4927-4934) for the
identification of clones containing DNA from non-cultivatable bacteria and their
selection for additional analysis.

The screening of enzymatic or antibacterial activities from metagenomic libraries has
been widely described in the scientific literature. The studies have related, for example,
to the direct detection of chitinase (Cottrell et al., 1999, Appl. Environ. Microbiol.,
65:2553-2557), lipase (Henne et al., 2000, Appl. Environ. Microbiol., 66:3113-3116),
and DNase and amylase (Rondon et al., 2000, Appl. Environ. Microbiol.,
66:2541-2547) activities, among others. In these studies, the host bacteria containing
the recombinant clones are cultured on a medium supplemented with the substrate
whose metabolisation is sought, and the screening of the activity is generally based
upon the appearance of haloes or precipitates around the colonies, or upon a change in
the appearance of the colonies which are metabolising the substrate being studied. It
should be noted that the enzymatic activities detected in these examples are new
activities for the host bacterium, but are not essential for its growth. A similar approach
was described in the Chromaxome patent (N° 5,783,431). This patent describes a
method of screening activity based upon the encapsulation of individual or pooled
clones from a library in a stable, inert and porous matrix (advantageously alginate), in
the form of macro- or micro-droplets. The droplets are for example placed in a liquid
culture containing the nutritive elements necessary for bacterial growth and a substrate
(for example X-glucosaminide, X-acetate, X-glucopyranoside, etc.), the metabolisation
of which is revealed by the appearance of a blue colouring.

Alternatively, the phenotypical screening described in the Proteus patent (N° FR 2 786
788) is based upon a prior preparation of the nucleic acid sequences encoding the target
protein (upstream and downstream elements necessary for the transcription and
translation of the target genes), the in vivo transcription and translation, and then the
detection and measurement of the activity of the target proteins.

All of these screening methods require the use of high throughput systems because they
involve subjecting all of the clones to the screening test in order to identify the clones of
interest which respond positively to the tests. For this purpose, the company Diversa, a
leader in the discovery of new molecules, has developed a unique platform, called the
GigaMatrix, enabling ultra-high throughput screening of around 1 billion clones per
day (http://www.diversa.com/techplat/gigamatrix/default.asp).

Another approach has already been described in patent WO 00/22170 of Microgenomics
(N° US 6,368,793 B1). This patent describes a methodology for identifying a metabolic
pathway transforming a substrate S into a desired product T by creating or identifying a
genetically manipulated organism whose capability of implementing this reaction is
placed under the control of an inducible promoter. This organism is used for
screening fragments of nucleic acids in order to detect a gene involved in the
transformation of a substrate into a product. The implementation of this method requires
the identification and genetic characterisation of the genes responsible for the
degradation of T in the expression host so that they can be placed under the control of
an inducible promoter. This type of construct cannot always be envisaged, in particular
when the genes in question are spread over the genome, and there is a possible risk of
"leakage" of the inducible system. Moreover, it represents a considerable amount of
work which has to be repeated for every product T studied. Finally, in this approach,
the organism used must be capable of incorporating and metabolising S and T. All of
these elements demonstrate the limits of the efficacy of this type of approach.

The majority of these technologies, with the exception of that described by
Microgenomics, therefore require the prior organisation of libraries, i.e. the
individualisation, storage and preservation of the clones in formats compatible with the
screening systems mentioned above. Moreover, the suitability of a metagenomic library
for a given problem (for example the search for a specific enzymatic function) can only
be established once all of the clones making up this library have been subjected to the
screening. Several hundred thousand clones must often be screened in order to detect,
perhaps, just one clone of interest. The creation of a metagenomic library is in
fact subject to a certain number of constraints, such as the prior choice of the
environment being explored, the bacterial community (or communities) being
considered within this environment, the cloning or expression vector, the sizes of the
cloned inserts, and the host organism likely to best express the heterologous
metagenomic DNA.

The time and the means required to create the metagenomic library and then to screen
it are therefore considerable, with little hope of success. Increasing the chances of
discovery would ideally involve the creation of a metagenomic library specific to
each problem, in order to best respond to the objectives set.

Summary of the invention

This invention relates to a method for the identification of a metabolic pathway or of
metabolic pathway families enabling the transformation of one or more substrate(s)
{Ai} into a desired product {B}. This method is based upon the selection or the
preparation of cells including at least one metabolic pathway or metabolic pathway
family enabling the transformation of one or more substrates {Ai} into a desired product
{B}. Furthermore, it enables the identification and characterisation of the gene or genes
encoding the enzyme or the enzymes involved in the conversion of the substrate {Ai}
into product {B}. This invention, based upon a series of
transformation-selection-purification cycles, targets the enormous microbial potential
(Figure 1). This method includes the following steps :

a) providing a population of host cells (Ai- ; B-) incapable of metabolising said
substrate or substrates {Ai} and said product {B} ;

b) transforming said population of host cells with a library of nucleic acid
sequences ;

c) testing in parallel said population of transformed host cells on minimum
media containing either one of the substrates {Ai}, or said product {B}, as the
only source of an element essential to growth ; and,

d) selecting said transformed host cell or cells capable of growth on a minimum
medium containing one of the substrates {Ai} and on a minimum medium
containing said product {B} (Ai+ ; B+), then optionally isolating the nucleic acid
molecule introduced at the time of the transformation in step b) and giving the
phenotype (Ai+ ; B+).
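
The selection logic of steps c) and d) can be sketched as follows. This is a minimal
illustration, assuming that growth on each minimum medium has already been scored as
a yes/no result; the function names and the clone identifiers are hypothetical and are
not part of the method as described.

```python
from typing import Dict, List, Tuple


def phenotype(grows_on_ai: bool, grows_on_b: bool) -> str:
    """Return the phenotype label used in the text, e.g. '(Ai+ ; B-)'."""
    ai = "Ai+" if grows_on_ai else "Ai-"
    b = "B+" if grows_on_b else "B-"
    return f"({ai} ; {b})"


def select_step_d(growth: Dict[str, Tuple[bool, bool]]) -> List[str]:
    """Keep only the clones that grow on {Ai} AND on {B}, as in step d)."""
    return [clone for clone, (on_ai, on_b) in growth.items() if on_ai and on_b]


# Hypothetical growth scores: clone -> (growth on {Ai}, growth on {B})
results = {
    "clone_01": (True, True),
    "clone_02": (True, False),
    "clone_03": (False, True),
}

for clone, (on_ai, on_b) in results.items():
    print(clone, phenotype(on_ai, on_b))

print("retained in step d):", select_step_d(results))
```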

Preferably, the method includes, before step c), a step consisting of testing the
population of transformed host cells on a minimum medium containing the substrate(s)
{Ai} and said product {B} as the only source of an element essential to growth and
selecting said transformed host cell or cells capable of growth on said minimum
medium containing the substrate(s) {Ai} and said product {B}; said selected host cell(s)
then being subjected to step c) and the subsequent steps.

The method according to this invention can also include, after step d), the following
steps :

e) implementing in vitro mutagenesis, or any other method known by the person
skilled in the art which leads to the same result, of the nucleic acid molecule
isolated from said host cell or cells (Ai+ ; B+) selected in step d) ;

f) re-transforming the population of host cells (Ai- ; B-) described in step a) with
the population of nucleic acids mutated in vitro in step e) and testing the host
cell(s) thus transformed on minimum media containing either one of the
substrates {Ai}, or said product {B}, as the only source of an element essential to
growth ; and,

g) selecting said transformed host cell(s) incapable of growth on a minimum
medium containing one of the substrates {Ai} and capable of growth on a
minimum medium containing said product {B} (Ai- ; B+), then optionally
isolating the mutated nucleic acid molecule.

In addition, the method can include the characterisation of the gene or genes encoding
the enzyme or enzymes involved in the conversion of the substrate {Ai} into product
{B} and isolated from said transformed host cell(s) (Ai- ; B+) selected in step g).

In a first alternative, the method includes, after step f), instead of or in parallel to
step g) :

h) selecting said transformed host cell(s) which has (have) become incapable of
growing on a minimum medium containing one of the substrates {Ai} and on a
minimum medium containing said product {B} (Ai- ; B-) ;

i) carrying out a quantitative analysis of the accumulation of the product {B} by
said transformed host cell(s) (Ai- ; B-) on a rich medium supplemented with
{Ai} ; and

j) selecting said transformed host cell(s) (Ai- ; B-) accumulating the product {B}
on a rich medium and optionally isolating in parallel the mutated nucleic acid
molecule introduced during the transformation of step f).

In addition, the method can include the characterisation of the gene or genes encoding
the enzyme or enzymes involved in the conversion of the substrate {Ai} into product
{B} isolated from said transformed host cell(s) (Ai- ; B-) selected in step j).

In a second alternative, the method includes, after step c), instead of or in parallel to
step d) and the subsequent steps, the following steps :

k) selecting said transformed host cell(s), incapable of growth on a minimum
medium containing one of the substrates {Ai} and capable of growth on a
minimum medium containing said product {B}, called receiving cells (Ai- ;
B+) ;

l) transforming said receiving cell(s) (Ai- ; B+) with a library of nucleic acid
sequences ;

m) testing in parallel said transformed receiving cell(s) (Ai- ; B+) on a minimum
medium containing one of the substrates {Ai} ;

n) selecting said transformed receiving cell(s) capable of growth on a minimum
medium containing one of the substrates {Ai} ; and

o) characterising the gene or genes encoding the enzyme or enzymes involved in
the conversion of the substrate {Ai} into product {B} and isolated from said
transformed receiving cell(s) (Ai+ ; B+) selected in step n).


Said library of sequences used in step l) can be the same as that used in step b) or be a
distinct library. If said sequence library is (i) the same as that used in step b), i.e. the
selection marker (resistance to an antibiotic, or auxotrophy marker) is the same as that
present in the receiving cell(s) of phenotype (Ai- ; B+), or is (ii) different from that used
in step b) but nevertheless carries the same selection marker, then it is necessary to
modify the selection marker of the nucleic acid sequence giving the phenotype (Ai- ;
B+), as described in steps kk) to kkkkk). Said modification, advantageously based upon
the replacement of the initial resistance to an antibiotic by a resistance to a second
antibiotic, makes it possible to apply in step m) a double selection pressure so as to
select the transformed cells containing the two nucleic acid sequences : the
nucleic acid sequence present initially and giving the capability to grow on {B}, and the
nucleic acid sequence newly acquired and giving the capability to convert {Ai} into
{B}.
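
The marker rule described above can be summarised by a small decision sketch. It
assumes, for illustration only, that each vector carries a single selection marker
identified by name; the marker names and function names are hypothetical.

```python
from typing import Tuple


def needs_marker_replacement(receiving_marker: str, library_marker: str) -> bool:
    """True when the vector in the receiving cell and the incoming library
    carry the same selection marker, so that the marker of the first vector
    has to be replaced before the second transformation."""
    return receiving_marker == library_marker


def double_selection(receiving_marker: str, library_marker: str) -> Tuple[str, str]:
    """Return the two selection pressures to apply together so that only
    cells holding both nucleic acid sequences are retained."""
    if needs_marker_replacement(receiving_marker, library_marker):
        raise ValueError("identical markers: replace the first marker beforehand")
    return (receiving_marker, library_marker)


# Hypothetical example: first vector re-marked with kanamycin resistance,
# second library carrying ampicillin resistance
print(double_selection("kanamycin", "ampicillin"))
```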



Preferably, the method includes, before step m), testing said transformed host cell(s)
(Ai- ; B+) on a minimum medium containing several substrates {Ai} as the only
source of an element essential to growth and selecting said host cell(s) capable of
growth on said minimum medium containing several substrates {Ai}; said selected host
cell(s) then being subjected to step m) and the subsequent steps.

In this second alternative, the invention also relates to a method in which : 

- between steps k) and l), said host cell(s) (Ai- ; B+) is/are modified by the
replacement of the first selection marker present in the vector containing the
nucleic acid sequence introduced in step b) by a new selection marker ;
- said library of nucleic acid sequences from step l) includes a selection marker
different from that carried by said host cell(s) (Ai- ; B+) ;
- the method further includes the following steps :

kk) the extraction and purification of the vectors contained in said host
cell(s) selected in step k) ;

kkk) the in vitro mutagenesis of said vector purified in step kk),
advantageously by transposition with a transposable element carrying for
example a functional resistance to an antibiotic different from that
already existing on this vector ;

kkkk) the transformation of said host cell(s) (Ai- ; B-) incapable of
metabolising said substrate or substrates {Ai} and said product {B} by
the mutated nucleic acids obtained in the previous step ;

kkkkk) the selection of the transformed host cells containing just said
second selection marker ; these transformed cells, of phenotype (Ai- ;
B+), formally called receiving cells, are then the object of the
transformation described in step l).

Said host cells are eukaryotic or prokaryotic cells. Preferably, they are :

- cultivatable in standard conditions known by the person skilled in the art,
- transformable or competent, and
- capable of stably maintaining the transforming exogenous DNA.

In one preferred embodiment, said host cells are bacteria.

Said library of nucleic acid sequences can be a metagenomic library. In a first
embodiment, said library of nucleic acid sequences comes from cultivatable prokaryotic
or eukaryotic organisms. In a second embodiment, said library of nucleic acid sequences
comes from non-cultivatable prokaryotic or eukaryotic organisms.

In one preferred embodiment of the invention, the element essential to growth is carbon. 

Preferably, the selection marker is a resistance gene to an antibiotic. 


The invention also relates to host cells selected by the methods according to this
invention, and to their use, in particular in bio-processes using these host cells capable
of transforming one or more substrates {Ai} into a desired product {B}. More
specifically, this invention relates to the use of a host cell selected in step g), j) or n) of
the methods according to this invention in a process for preparing the product {B} from
the substrate {Ai}.

The invention also relates to the gene or genes encoding the enzyme or enzymes
involved in the conversion of the substrate {Ai} into product {B} identified by the
methods of this invention, a vector containing it or them, a transformed cell containing
it or them, as well as any use of the latter. More specifically, this invention relates to the
use of a host cell transformed with the gene or genes encoding the enzyme or enzymes
involved in the conversion of the substrate {Ai} into product {B}, characterised
according to any of the methods of this invention, in a process for preparing
the product {B} from the substrate {Ai}.

Finally, this invention relates to a method for the identification, selection, or preparation
of a host cell (Ai- ; B-) incapable of metabolising said substrate or substrates {Ai} and
said product {B}, including the following steps :

- testing a population of host cells, cultivatable in standard laboratory conditions
and in industrial production conditions, transformable, and capable of stably
maintaining the transforming exogenous DNA, on a minimum medium
containing the substrate(s) {Ai} and said product {B} as the only source of an
element essential to growth ; and,

- selecting the host cell(s) incapable of growth on said minimum medium
containing the substrate(s) {Ai} and said product {B}.


The host cell can be a prokaryotic or eukaryotic cell, preferably a bacterium. 

The invention also relates to a host cell cultivatable in standard conditions known by the
person skilled in the art, transformable or competent, capable of stably maintaining the
exogenous transforming DNA, and incapable of growth on said minimum medium
containing the substrate(s) {Ai} and said product {B} as the only source of an element
essential to growth, in particular a host cell obtained by the process indicated above.
Furthermore, it relates to the use of this type of host cell for the identification of at least
one metabolic pathway or metabolic pathway family enabling the transformation of one
or more substrate(s) {Ai} into a desired product {B}.

i is a whole number between 1 and n, more specifically between 1 and 100, and 
preferably between 1 and 50 or 1 and 10. 


Description of the figures 

Figure 1 : General diagram of the process for detecting metabolic pathways. 
Figure 2 : Diagram of the primary transformation-selection cycle. 
Figure 3 : Diagram of the secondary transformation-selection cycle. 
Figure 4 and Figure 5 : Diagram of the alternative secondary selection cycle.

Detailed description of the invention 

This invention proposes selecting and identifying a metabolic pathway enabling the
transformation of a substrate {Ai} into a product {B}, in a manner which depends
neither upon the creation/identification of an organism capable of metabolising {B}
into an essential component under an inducible signal, nor upon the capability of
incorporating the desired product {B}. The invention makes it possible to specifically
exploit metabolic pathways originating from organisms capable of producing the target
{B} but also capable of catabolising it. In fact, because the invention specifically
exploits the metabolic pathways making it possible to convert a substrate {Ai} into a
product {B}, it makes it possible to eliminate all of the associated catabolism pathways
of the product {B} capable of affecting the accumulation of {B} in the original organism.

The invention proposes considerably reducing the time and the costs associated with the
search for a new enzymatic function within metagenomic libraries, by making the
desired function directly detectable by positive selection. Preferably, the desired
function is essential to the survival of the recombinant cell. Detection of this metabolic
pathway, leading non-exclusively to the transformation of a substrate {Ai} into a target
product {B}, therefore involves the compound {B} being, directly or indirectly, strictly
required for the growth of the host cell. The invention is thereby distinguished de facto
from a complementation search process which would aim to detect the metabolic
pathways enabling the host cell to grow on {Ai}.

In a first embodiment the process of the invention includes a primary selection- 
transformation cycle (figure 2) including the following steps: 

A- The identification of a host strain, preferably bacterial, incapable of developing
on one (or more) substrate(s) {Ai} (i between 1 and n) and on the product {B} of the
desired function ;

B- The transformation of said identified strain by a library of nucleic acid
sequences, preferably environmental DNA, cloned in an appropriate vector ;

C- The primary selection of recombinant clones on a minimum medium containing
the substrate(s) {Ai} and the target product {B} as the only sources of an element
essential to growth (figure 1). The recombinant clones capable of metabolising at
least one of the precursors provided (one or more substrates {Ai} and/or the target
product {B}) are preserved, then tested in parallel on minimum media each
containing just one of the precursors. This primary selection therefore makes it
possible to select in one step three types of phenotype :

- type 1 : recombinant clones capable of growing both on one (or several)
substrate(s) {Ai} and on the target product {B}. These clones of phenotype
(Ai+ ; B+) are likely to convert {Ai} into {B} and are therefore subjected to
the following step ;
- type 2 : recombinant clones capable of growing only on one (or several)
substrate(s) {Ai} but not on the target product {B}. These clones of phenotype
(Ai+ ; B-) are a priori incapable of producing the target product {B} ; and,
- type 3 : recombinant clones capable of growing only on the target product {B}.
These clones of phenotype (Ai- ; B+) can advantageously enable the
development of a receiving strain of the host organism useable for detecting by
direct selection any recombinant clone capable of synthesising the target product
{B}. These clones can be used in an alternative transformation-selection
embodiment described below.

The capability of growing on {Ai} and the capability of growing on {B} can be
associated or independent (cf. figure 2).

D- The in vitro mutagenesis of the nucleic acids from the transforming sequence
library and isolated from the clones with a phenotype (Ai+ ; B+).

E- The parallel selection on {Ai} and on {B} of the clones resulting from the
transformation by the mutated nucleic acids, which enables the identification of the
transformed clones affected by the mutagenesis (figure 3). This selection makes it
possible to select :

- transformed clones resulting from mutation, of phenotype (Ai- ; B-), having lost
the capability of growing on {Ai} and on {B}. This change of phenotype can be
explained either because (i) the metabolic pathway of {Ai} passes via {B} and
the metabolism of {B} is disrupted (mutated phenotype IIIa), or because (ii) the
mutagenesis has affected an element common to {Ai} and {B} such as for
example a regulation element, a common transporter, etc. (mutated phenotype IIIb).

- transformed clones resulting from mutation, of phenotype (Ai- ; B+), having lost
only the capability to grow on {Ai}. The metabolic pathway in question, enabling
the transformation of {Ai} into {B}, is disrupted (mutated phenotype IV).

F- The quantitative analysis of {Ai} and {B}, by direct or indirect analytical
methods, in the clones of phenotype (Ai- ; B-).

G- The genetic characterisation of the biocatalyst (function : {Ai} is converted into
{B}) by means of the clones of phenotype (Ai- ; B+) or the clones of phenotype
(Ai- ; B-) accumulating the product {B} on a rich medium supplemented with {Ai}.

Only the phenotypes (Ai- ; B+) and (Ai- ; B-) of the transformed clones produced by in
vitro mutagenesis enable the identification and characterisation of the novel metabolic
pathways sought, which transform {Ai} into {B}. The accumulation of {B} and its
chemical detection are implemented by culturing the clones of phenotype (Ai+ ; B+), or
preferably (Ai- ; B-), on a rich medium supplemented with {Ai}.
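
The interpretation of the clones obtained after mutagenesis can be sketched as a simple
lookup. This is a hedged illustration, assuming yes/no growth scores on each minimum
medium and a yes/no result for the accumulation of {B} on a rich medium supplemented
with {Ai}; the function name and the wording of the returned labels are hypothetical.

```python
def interpret_mutant(grows_on_ai: bool, grows_on_b: bool,
                     accumulates_b: bool = False) -> str:
    """Map the growth pattern of a mutated clone to the outcome described
    in the text (illustrative labels only)."""
    if grows_on_ai and grows_on_b:
        # Phenotype unchanged relative to the parental (Ai+ ; B+) clone
        return "unchanged (Ai+ ; B+): presumably not affected by the mutagenesis"
    if not grows_on_ai and grows_on_b:
        return "(Ai- ; B+): conversion of {Ai} into {B} disrupted, characterise"
    if not grows_on_ai and not grows_on_b:
        if accumulates_b:
            return "(Ai- ; B-) accumulating {B}: retained, characterise"
        return "(Ai- ; B-) without accumulation of {B}: not retained"
    # Growth on {Ai} only is not discussed in the text for this step
    return "(Ai+ ; B-): not described for this step"


# Hypothetical mutated clones: (growth on {Ai}, growth on {B}, accumulation of {B})
for scores in [(True, True, False), (False, True, False), (False, False, True)]:
    print(scores, "->", interpret_mutant(*scores))
```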

An advantage of the process of the invention is that only the positive clones need to be
considered in this type of primary selection, because only these clones are capable of
developing. Thus, it is not necessary to screen all of the clones contained in the library.

In an alternative embodiment of the invention, when the primary selection step of the
first embodiment does not make it possible to detect clones having a phenotype (Ai+ ;
B+), any (Ai- ; B+) phenotypes offer the possibility of developing a receiving strain of
phenotype (Ai- ; B+) capable of being co-transformed by a second metagenomic library.
This alternative embodiment makes it possible to exploit within the metagenomic
library the clones (Ai+ ; B-) capable of converting at least one of the substrates {Ai}
into target product {B} but incapable of metabolising {B} (clones not selected in the
primary selection step of the first embodiment).

Based upon a transformation-selection system, the invention :

- makes it possible to directly select metabolic pathways converting one (or
several) perfectly characterised substrate(s) into a target product of interest ;

- makes it possible to directly select, in parallel, perfectly characterised
metabolic pathways metabolising one (or several) substrate(s) as (a) single source(s) of
an essential element. Following genetic and chemical characterisation, these metabolic
pathways make it possible to enrich specialist enzyme libraries ;

- makes it possible, alternatively, to easily develop a receiving strain of
the host organism, capable of growing on a target product of interest, and useable de
facto for a second transformation-selection cycle for the conversion of one or several
perfectly characterised substrate(s) into this target product of interest. This receiving
strain of the host organism is characterised in that its development results from the
temporary integration of a recombinant vector and that its own genetic heritage
remains unchanged ;

- makes it possible to exploit, during each transformation-selection cycle, the
enormous genetic potential of metagenomic libraries to great effect, without it being
necessary to structure these libraries in advance and, de facto, without it being necessary
to resort to high throughput screening systems.

The lack of prior structuring of the libraries consequently makes it possible to shift
one's efforts towards creating metagenomic libraries which optimise the chances of
discovering the target metabolic pathway.

The invention further relates to the selection of a host cell incapable of metabolising a
substrate {Ai} and incapable of metabolising a desired product {B}. Preferably, this
host cell is a bacterial host. These are, non-restrictively, E. coli, Bacillus, Streptomyces,
Pseudomonas, Nocardia, Acinetobacter, etc. This cell must be capable of
being transformed by any of the techniques known by the man skilled in the art. This
can be, non-restrictively, transformation by electroporation, by conjugation, by
transduction, by infection, etc. This cell is used as an expression host for
individualising the vectorised fragments of DNA.

Definitions 

Metagenome means all of the genomes of a microbial community of a given
environment.

Metagenomics means, in the strict sense of the term, the global analysis of a
metagenome independently of any artificial culture of the microorganisms. It is
commonly accepted that, beyond the direct study of the genetic information (metagenomic
DNA), metagenomics is based upon the prior creation of metagenomic libraries.

Metagenomic DNA library means a population of metagenomic DNAs cloned in a
cloning or expression vector, enabling the transfer and maintenance of this
metagenomic DNA in one (or several) host organism(s). The metagenomic library can be
non-redundant, in that each cloned metagenomic molecule of DNA is unique, or can be
amplified, in that each cloned metagenomic molecule of DNA has been multiplied.

Metagenomic library means a population of clones from a host organism having
incorporated the population of cloned metagenomic DNAs, as referred to above
(recombinant clones). The metagenomic library can be non-redundant in that every
recombinant clone is unique, or can be amplified, in that every recombinant clone has
been multiplied. The amplification of an original non-redundant library thus enables the
division of the amplified library into sub-libraries, within which the diversity remains
representative both of that of the amplified library and of that of the original
non-redundant library.

Recombinant vector means an expression or cloning vector having integrated
exogenous genetic information, for example genomic DNA or metagenomic DNA.

Recombinant clone means a population of identical clonal cells of a host organism
having integrated a recombinant vector, for example by genetic transformation.

Biocatalytic pathway means a set of catalytic proteins (enzymes) implementing the 
conversion of a starting compound (substrate) into a final compound (target product). 

Metabolic pathway family means a set of nucleic acid sequences, the expression product
of which is capable of implementing the transformation of a substrate {A} into a product
{B}.

Shuttle vector means a vector enabling the transfer and the maintenance of
genetic information from one (or more) donor bacterial species or strain(s) to one or more host
organism(s), strain(s) or species.

Description of the environments

The invention is first of all based upon the creation of metagenomic libraries originating
from an environmental sample. Soil and sediments form major environments for the
search for novel active metabolites, not only due to the very large quantity of
microorganisms that they contain, but also due to the considerable diversity of these
microorganisms. Microorganisms have been detected in a very great number of
environments, ranging from the stratosphere to abyssal depths, including extreme
habitats in terms of the physicochemical conditions which prevail there. The invention
applies non-restrictively to samples taken from soil, sediments, aquatic environments
(fresh or sea water), plants, insects, animals, bioreactors such as biofilms, fermenters
and activated sludge, but also from animal- or human-derived environments (such as, for
example, rumen or faeces), and advantageously all environments having a quantitative
(high concentration of microorganisms) or qualitative (specificity of the microbial
community or communities) advantage.

The environmental sample contains a multitude of organisms including eubacteria,
archaebacteria, algae, fungi, yeasts, protozoans, viruses, phages, parasites, etc. The
microorganisms can be represented by extremophiles such as thermophiles,
psychrophiles, acidophiles, halophiles, etc. The environmental sample can contain
cultivatable or non-cultivatable, known or unknown microorganisms, as well as free
nucleic acids and organic matter.

Preparation of the nucleic acids 

The environmental DNA is collected from cultivatable or non-cultivatable organisms by
any of the techniques known by the man skilled in the art. Two main approaches are
generally adopted in order to extract environmental DNA. The first approach, called the
direct approach, consists of extracting the nucleic acids from the sample by means of in
situ lysis of the microorganisms, followed by extensive purification of the released
nucleic acids. The lysis of the bacterial cells can be implemented by the single or
combined use of multiple processes for physically, chemically and/or enzymatically
disrupting the cell walls and membranes, processes known by the man skilled in the art
(for a review, see Robe et al., 2003, Eur. J. Soil Biol., 39 : 183-190). The extensive
purification of the nucleic acids released during the lysis can be implemented,
individually or in combination, by numerous methods known by the man skilled in the
art, including, non-restrictively, ultracentrifugation on cesium chloride gradients,
passing over hydroxyapatite columns, electrophoresis on agarose gel, filtration on
resins, or any other commercialised process for the purification of nucleic acids (for a
review, see Robe et al., 2003). The second approach, called the indirect approach,
consists of a prior separation of the microorganisms of the sample, non-restrictively, by
differential centrifugation or by centrifugation on density gradients, followed by lysis of
the microorganisms separated in this way, then extensive purification of the released
nucleic acids. The steps of lysis of the microorganisms and purification of the nucleic
acids are all implemented by using, individually or combined, numerous methods
known by the man skilled in the art and mentioned above.

The relative efficacy of these two approaches as well as their respective advantages and
disadvantages have been the subject of numerous scientific studies, and are known by the
man skilled in the art. Establishing the strategy for extracting the nucleic acids, i.e. the
choice of one or other of these two approaches and the choice of the different methods,
is based non-restrictively upon the characteristics of the environment being examined,
upon the targeted microorganisms (all or some of the microorganisms of this
environment) and their characteristics, upon the size of the nucleic acids, and upon the
choice of the cloning vector and the cloning strategy.

Host organisms 

Any vector-host system known in the prior art can be used in this invention. The
identification of a host cell forms Step 1 of this invention. The host cell can be
eukaryotic or prokaryotic. Preferably, the host cell used is a bacterial host. These can be,
non-restrictively, Escherichia coli, Bacillus subtilis, Streptomyces lividans,
Pseudomonas, Nocardia, Acinetobacter, etc. Examples of eukaryotic host cells, without
being restricted to these, are yeasts and fungi. The host cell (i) can originate from
collections of public strains, from private laboratories or commercial companies ; (ii)
must be selected or modified for its inability to metabolise the substrate(s) {Ai} and the
target product {B} ; (iii) must be able to be cultivated in the standard conditions known
by the man skilled in the art ; (iv) must be capable of being transformed by any of the
techniques known by the man skilled in the art ; (v) must, finally, stably maintain the
transforming exogenous DNA despite possible systems such as recombination,
restriction, etc.

Cloning and expression vectors 

The nucleic acids are cloned within an appropriate expression vector that allows the
DNA to be maintained by replication. The expression vector used depends upon the size of the
purified nucleic acids, the desired final size of the insert (generally between 5 kb and
over 100 kb), and upon the expression host chosen, which is preferably a bacterial host.

Numerous cloning or expression vectors have been described in the prior art.
Non-restrictively, these are plasmids, cosmids such as those marketed by the companies
Stratagene (SuperCos and pWE15) and Epicentre Technologies (pWeb cosmid cloning
kit), fosmids as described by Kim et al. (1992, Nucl. Acids Res., 20 : 1083-1085),
artificial chromosomes PAC as described by Ioannou et al. (1994, Nat. Genet., 6 : 84-89),
artificial chromosomes BAC as described by Shizuya et al. (1992, Proc. Natl.
Acad. Sci., 89 : 8794-8797), artificial chromosomes YAC as described by Larin et al.
(1991, Proc. Natl. Acad. Sci., 88 : 4123-4132), phagemids and vectors derived from
phages such as those marketed by the company Stratagene (Lambda Dash II and Zap II),
viral vectors, etc. Preferably, the vectors are of the cosmid, fosmid, BAC, YAC and
P1-derivative type because they enable the cloning of large fragments of DNA (between 30
kb and 200 kb and over for the BACs and the YACs). The vectors can either be
integrative, in that they integrate randomly or in a controlled way into the genome of the
host cell, or preferably be replicative, in that the vector is maintained in the host cell
independently of the genome of this cell. By definition, the cloning vectors contain a
certain number of elements necessary for maintaining the vector in the host cell
(functional origin of replication), or else necessary for the selection and/or the detection of the
vector in this cell (marker gene such as, for example, an antibiotic resistance gene
under the control of a promoter functional in the host cell and enabling a positive selection
pressure). Due to the specificity of these constitutive elements, the vectors have a wider
or narrower host spectrum.



Cloning

The cloning process, i.e. the introduction of the sequences of nucleic acid, preferably
purified metagenomic DNAs, into the appropriate vector, requires numerous steps of
molecular manipulation of the DNAs (including, non-restrictively, restriction digestions,
dephosphorylations, ligations, etc.) which have been widely described, for example in
Current Protocols in Molecular Biology, Eds. F.M. Ausubel, R. Brent, R.E. Kingston,
D.D. Moore, J.G. Seidman, J.A. Smith and K. Struhl, published by Greene Publishing
Associates and Wiley Inter-Science. Two approaches for creating metagenomic libraries
can be considered.

In a first preferred embodiment, the metagenomic library is formed directly in a shuttle
vector specific to one or more hosts, preferably bacterial, for example as described in
patents N° WO 01/40497A2 (Aventis Pharma, 1999) and WO 99/67374 (Biosearch
Italia, 1999) for Streptomyces. In a second embodiment, the purified nucleic acids are
cloned in a general vector, for example of the fosmid or BAC type, then the
recombinant vectors are modified, individually or in a pool, advantageously by
transposition as described in patent application N° PCT/EP 03/07765 (Libragen). In this
process, the transposition makes it possible to introduce, either into the vector, or into
the insert (disruption or activation), the genetic elements necessary for the transfer, the
replication or the integration of the recombinant vector in the chosen host cell,
preferably a bacterial host. This post-modification of the clones of the library can be
implemented individually (metagenomic library structured in 96- or 384-well
microplate format) or collectively (non-structured metagenomic library). The transformation
of the population of host cells identified in Step 1 by a population of cloned DNAs
forms Step 2 of this invention. In the two embodiments, the metagenomic library can be
structured in advance, in that all of the clones of the library are individualised in a format
capable of being automated (96-, 384- or 1536-well microplates), or preferably be preserved in
the form of a mixture of recombinant clones. In this preferred preservation mode, the
library can advantageously be amplified, in that the host cells, after transformation or
infection, are multiplied over a specific number of cycles, leading to every recombinant
clone of the library being represented by n copies in the amplified library, and the
amplified library being able to be subjected to numerous simultaneous screening or
selection tests, without any loss of diversity.

Detection and identification of the metabolic pathways

Step 3 : The recombinant clones are directly selected, without a prior culture
step, on a minimal culture medium containing both the n substrates {Ai} (i between 1 and n)
and the target product {B} as the only sources of an essential element, as well as an
antibiotic (such as chloramphenicol) making it possible to maintain a selection pressure
on the host cells having integrated a recombinant vector. This primary selection step,
applied to an original or amplified library of several tens or even several hundreds of
thousands of clones, makes it possible ultimately to consider just the recombinant clones
capable of metabolising one of the substrates {Ai} independently of the target product
{B}, or the target product {B}, or one of the substrates {Ai} via {B}. This
step is, however, optional.

Step 4 : The clones selected on minimal medium containing the n substrates
{Ai} and {B} (figure 2, clones 1, 2 and 3) are preserved, and then tested in parallel on
minimal medium containing either one of the substrates {Ai}, or {B}. For a given
substrate {Ai}, this multiple selection makes it possible to identify three distinct
phenotypes (figure 1) :

- the phenotype (Ai+ ; B+) of the clones (of type 1). This is the preferred
phenotype, corresponding to clones capable of growing on
minimal medium with the substrate {Ai} but also with the target product {B}
as the only source of an essential element. The clones having this phenotype are
preserved and subjected to the following step (figure 3).

- the phenotype (Ai+ ; B-) of the clones (of type 2). This corresponds to clones
capable of growing on minimal medium with the substrate {Ai} as the only
source of an essential element independently of the target product {B}. Although they
are not processed during the subsequent steps, these clones are nevertheless
preserved because they are capable of enriching libraries of enzymes after
identification of the metabolic pathways involved in the metabolisation of the
substrate {Ai}.

- the phenotype (Ai- ; B+) of the clones (of type 3). This corresponds to clones
incapable of growing on minimal medium with the substrate {Ai} as the only
source of an essential element but which are, however, capable of metabolising
the target product {B} in order to grow. These clones can advantageously be
used to develop a receiving strain of the organism, having a
phenotype (Ai- ; B+), usable in an alternative secondary transformation-selection
process for the biocatalysis of {Ai} into {B} (figure 4). These clones
are also capable of enriching libraries of enzymes after identification of the
metabolic pathways involved in the metabolisation of the product {B}.
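
The triage of step 4 can be pictured with a short sketch. The Python below is purely illustrative and is not part of the described method: the function name, the clone identifiers and the two input sets (clones that grew on minimal medium with {Ai} alone, and with {B} alone) are hypothetical.

```python
# Illustrative sketch of the step 4 triage; all clone names are hypothetical.

def triage(selected, grows_on_ai, grows_on_b):
    """Sort the clones retained in step 3 into the three step 4 phenotypes.

    selected    -- clones that grew on minimal medium with {Ai} and {B}
    grows_on_ai -- clones that grew on minimal medium with {Ai} only
    grows_on_b  -- clones that grew on minimal medium with {B} only
    """
    selected = set(selected)
    on_ai = selected & set(grows_on_ai)
    on_b = selected & set(grows_on_b)
    return {
        "type 1 (Ai+ ; B+)": sorted(on_ai & on_b),  # carried on to step 5
        "type 2 (Ai+ ; B-)": sorted(on_ai - on_b),  # kept for enzyme libraries
        "type 3 (Ai- ; B+)": sorted(on_b - on_ai),  # candidate receiving strains
    }


result = triage(
    selected=["c1", "c2", "c3", "c4"],
    grows_on_ai=["c1", "c2"],
    grows_on_b=["c1", "c3", "c4"],
)
for phenotype, clones in result.items():
    print(phenotype, clones)
```

Set intersections and differences mirror the parallel plating: a clone is of type 1 only if it appears on both test plates.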

Step 5 : The plasmid DNA of the recombinant clones of phenotype (Ai+ ; B+),
selected in step 4 and capable of growing both on {Ai} and on {B}, is extracted by any
of the techniques known by the man skilled in the art (Sambrook et al., 1989) (figure 3).
This plasmid DNA is subjected to genetic disruption, advantageously by the random
insertion of transposable elements such as those marketed by Epicentre
Technologies (EZ::TN), and reintroduced into the host organism (figure 3A). The
transformants are first of all spread on rich solid medium until colonies appear,
then the colonies are transferred by replica plating onto minimal medium containing just the
substrate {Ai} on the one hand, and just the target product {B} on the other hand. The
capability of growing on {Ai} and {B} can be the result of a common metabolic
pathway (figure 3B, genotype [AB]) or of two independent metabolic pathways (figure
3B, genotype [A,B]).

Step 6 : The parallel selection on {Ai} and {B} of the clones resulting from
mutagenesis makes it possible to identify different phenotypes of interest :

- the mutated phenotype I (Ai+ ; B+)* of transposed clones capable of growing on
{Ai} and {B} ; either mutagenesis by transposition has no effect in the
metagenomic insert or it affects the vector.

- the mutated phenotype II (Ai+ ; B-)* of transposed clones capable of
metabolising {Ai} in order to grow but not {B}. The metabolism of {Ai} does
not pass via the production of {B} in order to enable growth (figure 2, genotype
[A,B]).

- the mutated phenotype III (Ai- ; B-)* of transposed clones capable of using
neither {Ai} nor {B} as sources of growth. Either (i) mutagenesis by
transposition has reached an element common to the metabolic pathways of
{Ai} and {B} such as, for example, a regulation element, a common transporter,
etc. ; or (ii) the metabolic pathway of {Ai} passes via {B} and the metabolic
pathway of {B} enabling growth is disrupted.

- the mutated phenotype IV (Ai- ; B+)* of transposed clones capable of using {B}
but not {Ai} in order to grow. The metabolic pathway in question enabling the
conversion of the substrate {Ai} into the target product {B} is disrupted.
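
As a reading aid, the four outcomes of step 6 can be written as a lookup from replica-plating results to their interpretation. This is a minimal sketch under stated assumptions: the function names and the clone identifiers are hypothetical, and the interpretations are simply those listed above.

```python
# Illustrative sketch of the step 6 read-out; clone names are hypothetical.

INTERPRETATION = {
    (True, True): "I (Ai+ ; B+)*: insertion without effect in the insert, or in the vector",
    (True, False): "II (Ai+ ; B-)*: {Ai} is metabolised without passing via {B}",
    (False, False): "III (Ai- ; B-)*: common element hit, or {Ai} passes via {B} "
                    "and the {B} pathway is disrupted",
    (False, True): "IV (Ai- ; B+)*: conversion of {Ai} into {B} is disrupted",
}


def classify_mutant(grows_on_ai, grows_on_b):
    """Return the mutated phenotype for one transposed clone."""
    return INTERPRETATION[(bool(grows_on_ai), bool(grows_on_b))]


def conversion_candidates(replica_results):
    """Return the clones of phenotype IV, i.e. (Ai- ; B+)*.

    replica_results maps a clone identifier to (grows_on_ai, grows_on_b).
    """
    return sorted(
        clone for clone, growth in replica_results.items()
        if tuple(growth) == (False, True)
    )


replicas = {
    "m01": (True, True),
    "m02": (False, True),
    "m03": (False, False),
    "m04": (False, True),
}
print(conversion_candidates(replicas))
print(classify_mutant(*replicas["m03"]))
```

Only phenotype IV clones are returned as candidates, because they are the ones in which the conversion of {Ai} into {B} itself is disrupted.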

Step 7 : The passage via {B} in the metabolisation of {Ai} is verified by
evaluating the accumulation of {B} by techniques of analytical chemistry when a clone
of phenotype (Ai- ; B-), isolated in step 4, develops on rich medium.

Step 8 : The genetic characterisation of the biocatalyst, i.e. characterisation of
the gene or genes encoding the enzyme or enzymes involved in the conversion of {Ai}
into {B}, is implemented by means of the transposed clones having the phenotype (Ai- ;
B+). The genetic analysis of the nucleic sequences located at the disruption site or sites
of the recombinant clones (Ai- ; B+) makes it possible to elucidate the genetic system(s)
responsible for the conversion of {Ai} into {B}. The genetic analysis is implemented by
any methods known by the man skilled in the art, including, non-restrictively,
establishing sequences of nucleic acids, identifying coding and regulating sequences,
etc.
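
One way to picture the use of the disruption sites is to group the transposon insertion coordinates of the (Ai- ; B+) mutants into candidate loci. The sketch below is an assumption, not a rule given in this description: the coordinates, the `max_gap` threshold and the function name are hypothetical.

```python
# Illustrative sketch only: the description does not prescribe any clustering
# rule. Coordinates are base pairs along the metagenomic insert (hypothetical).

def cluster_insertions(positions, max_gap=1000):
    """Group insertion coordinates into candidate loci.

    Two consecutive insertions are placed in the same locus when they lie
    no more than max_gap bp apart. Returns (start, end, count) tuples.
    """
    loci = []
    for pos in sorted(positions):
        if loci and pos - loci[-1][1] <= max_gap:
            start, _, count = loci[-1]
            loci[-1] = (start, pos, count + 1)  # extend the current locus
        else:
            loci.append((pos, pos, 1))          # open a new locus
    return loci


sites = [12050, 12400, 13100, 27800, 28050]
print(cluster_insertions(sites))
```

Regions hit by several independent insertions would be natural starting points for sequencing and for identifying coding and regulating sequences.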

This method makes it possible to gain access rapidly and directly (in a single step) to a 
metabolic pathway family capable of transforming a substrate {Ai} into a product {B}. 

Alternative transformation-selection process 

In an alternative embodiment of the invention, in particular when step 4 of the first
embodiment does not make it possible to detect clones having a phenotype (Ai+ ; B+),
any phenotypes (Ai- ; B+) offer the possibility of developing a receiving strain of
phenotype (Ai- ; B+) capable of being co-transformed by a second metagenomic library
(figure 4). This library can be the same as the first library or can be a distinct library.
This alternative embodiment makes it possible to exploit, within the metagenomic
library, the clones capable of converting at least one of the substrates {Ai} into target
product {B} but incapable of metabolising {B} (clones not selected in step 4 of the first
embodiment). This alternative embodiment involves several successive steps :

- a) Providing a population of transformable host cells (Ai- ; B+). This optional
step is only necessary if one wishes to transform the population of host cells (Ai- ; B+)
by a metagenomic or genomic library of DNA selectable by the same resistance marker.
This optional step itself involves several successive steps (figure 4) :

- plasmid purification of one or more clones having the phenotype (Ai- ; B+),

- in vitro mutagenesis, advantageously by transposition as described above for
step 5, with a transposable element carrying a functional resistance to an antibiotic
absent from the target vector (for example resistance to apramycin), on the purified
clone or clones,

- transformation of a population of host cells (Ai- ; B-) and selection of the
mutations affecting the resistance gene to the antibiotic of the target vector (for
example resistance to chloramphenicol). Verification of the growth of this receiving
strain on minimal medium with the target product {B} as the only source of carbon
makes it possible to rule out the possibility of another mutation event (transposition)
altering the function of using {B} for growth. The receiving strain of the organism is
then available, capable of growing on apramycin and on minimal medium
containing the target product {B} as the only source of an essential element.

- b) Transformation of the receiving strain by a metagenomic library of DNA
and selection of recombinant clones on minimal medium containing n substrates {Ai}
(i between 1 and n) as the only source(s) of an essential element and the two antibiotics
of the receiving strain and the recombinant vectors.

- c) The clones selected on minimal medium containing the n substrates {Ai}
(figure 4, clones 1, 2) are preserved, and then tested in parallel on minimal medium
containing one of the substrates {Ai}. For a given substrate {Ai}, this multiple selection
makes it possible to identify colonies having a phenotype (Ai+ ; B+) expressing a priori
the capability of the clone to grow on the target product {B} by the conversion of the
substrate {Ai} into {B}.

- d) Plasmid purification of one or more clone(s) having the phenotype (Ai+ ;
B+). The sample of purified plasmids (figure 5) contains in a mixture the recombinant
vector ApraR (metabolic pathway enabling growth on the target product {B}) and the
recombinant vector CatR (metabolic pathway implementing the conversion of the
substrate {Ai} into {B}).

- e) The host cell of phenotype (Ai- ; B-) is transformed with the mixture of
recombinant vectors purified in d) and the recombinant clones CatR ApraS are
selected. The conversion of the substrate {Ai} into {B} is verified by evaluating the
accumulation of {B} by techniques from analytical chemistry when these clones CatR
ApraS develop on rich medium. The accumulation of {B} confirms that these clones
CatR ApraS do indeed have a phenotype (Ai+ ; B-).

- f) Plasmid purification of one or more clone(s) having the phenotype (CatR
ApraS) identified in e). The recombinant vector is subjected to genetic disruption,
advantageously by the random insertion of transposable elements.

- g) Transformation of the receiving strain (Ai- ; B+) by the population of
recombinant vectors mutagenised in f). The transformants are first of all spread over
rich solid medium until colonies appear, then the colonies are transferred by replica plating onto
minimal medium containing just the substrate {Ai} as the source of an essential
element. The parallel selection of the clones resulting from mutagenesis on rich medium
and on minimal medium containing just the substrate {Ai} makes it possible to
identify the clones which have become incapable of growing with the substrate {Ai} as
the source of an essential element, i.e. in which the disruption affects the metabolic
pathway converting {Ai} into {B}.

- h) The genetic characterisation of the biocatalyst, i.e. characterisation of the
gene or genes encoding the enzyme or enzymes involved in the conversion of {Ai} into
{B}, is implemented by means of transposed clones having the phenotype (Ai- ; B+).
The genetic analysis of the nucleic sequences located at the disruption site or sites of
the recombinant clones (Ai- ; B+) makes it possible to elucidate the genetic system(s)
responsible for the conversion of {Ai} into {B}.

Example 1: Search for the metabolic pathway for the bioconversion of phytosterols
into 4-androstene-3,17-dione (AD)

4-Androstene-3,17-dione (AD, CAS N° 63-05-8) and 1,4-androstadiene-3,17-dione
(ADD, CAS N° 897-06-3) are important intermediates for the pharmaceutical
industry, as key precursors for the production of therapeutic steroids. Numerous
microorganisms have the natural capability of degrading 3β-hydroxy-Δ5-sterols (for
example β-sitosterol, campesterol or brassicasterol) by forming AD and ADD as
degradation intermediates.




[Scheme: conversion of β-sitosterol, brassicasterol and other sterols into AD]

The microbial conversion of natural phytosterols into AD by characterised
bacterial strains has been widely described (Shashabi B. Mahato et al., 1997, Advances in
microbial steroid biotransformation, Steroids, 62, 332-345), in particular by
Mycobacterium sp. mutant strains. However, this bioconversion comes up against a
number of limitations : the poor solubility of the phytosterols used as substrates, the
poor yields associated with long fermentation times and the concomitant
production of AD and ADD, which necessitates difficult and costly separation of these
two steroid products.

Moreover, scientific studies (van der Geize et al., 2002, Microbiology,
148 : 3285-3292) conducted on Rhodococcus erythropolis have demonstrated that
inactivation of the 3-ketosteroid dehydrogenase (KSTD) enzyme, involved in the
catabolism of AD, was not sufficient to prevent the growth of R. erythropolis
on AD as the only source of carbon and energy.

The aim of the strategy adopted is to detect the metabolic pathways enabling the
specific conversion of phytosterols of different origins into AD, while eliminating the AD
catabolic pathways. The metabolic pathways are explored in parallel within a BAC genomic
library of Mycobacterium vaccae and within a BAC library of metagenomic DNAs
originating from soil.

The host organism retained is Streptomyces lividans, which meets all of the
criteria : it is a cultivatable, transformable bacterium, capable of expressing metabolic
pathways originating from Mycobacterium and from soil bacteria with a high GC content,
known for their capability of degrading phytosterols, and incapable of growing on
minimal medium supplemented with phytosterols and androstenedione as the only
sources of carbon and energy.

S. lividans is transformed by the aforementioned DNA libraries. The
transformants are plated on a solid M9 medium supplemented with 10 µg/L of
chloramphenicol and containing 0.5% phytosterols and 0.5% androstenedione (AD) as the
only source of carbon and energy. They are incubated for 5-10 days at 30°C. The clones
retained which are capable of growing are then tested separately on minimal medium
supplemented with phytosterols on the one hand and AD on the other hand. The
capability of growing on phytosterols and/or on AD is linked to the DNAs introduced
during the transformation.

The clones capable of growing both on phytosterols and on AD as the only
sources of carbon are selected, their vectors are re-extracted and then subjected to in
vitro mutagenesis by transposition (EZ::TN transposition kit, Epicentre Technologies).
The mutated recombinant vectors are re-introduced into S. lividans. The transformants
are plated on rich medium and are tested in parallel on a solid M9 medium supplemented
with 10 µg/L of chloramphenicol and containing 0.5% phytosterols on the one hand, and
0.5% androstenedione (AD) on the other hand, as the only source of carbon and energy.
The S. lividans clones having lost the capability of growing on phytosterols but capable
of growing on AD are selected, and their recombinant vectors are re-extracted. The genetic
characterisation of metabolic pathways involved in the conversion of the phytosterols
into AD is implemented by sequencing said recombinant vectors previously selected.

Example 2: Search for the metabolic pathway for the bioconversion of
1-phenyl-2-propanone into 1-phenyl-2-propanol

Phenyl-2-propanol (CAS N° 103-79-7) is a compound widely used as a structural motif
in numerous active ingredients (Liese, A. et al., 2000, Industrial
Biotransformation, ed. Wiley-VCH, 103-106). It is found in particular as an intermediate
for the synthesis of amphetamines (Bracher, F. et al., 1994, Arch. Pharm. 327, 591-593).
Bacteria such as Rhodococcus erythropolis (Liese, A. et al., 2000), but also the yeast
Saccharomyces cerevisiae (Gillois, J. et al., 1989, Journal of Organometallic Chemistry
367(1-2), 85-93), have been described as catalysing the reaction A shown below,
but with reaction yields of around 70%.

[Reaction scheme: 1-phenyl-2-propanone converted into 1-phenyl-2-propanol]

Reaction A : Bioconversion of 1-phenyl-2-propanone into 1-phenyl-2-propanol.

The following strategy is adopted in order to select a novel and very effective metabolic
pathway family which catalyses the bioconversion of 1-phenyl-2-propanone into
1-phenyl-2-propanol (CAS N° 698-87-3).

The culture media are sterilised in the autoclave at 121°C for 20 minutes. Casamino
acids, L-tryptophan and thiamine HCl are cold sterilised using a 0.2 µm Millipore
membrane and are added to the culture medium following sterilisation.
1-Phenyl-2-propanone and 1-phenyl-2-propanol are dissolved in ethanol and cold sterilised, then
filtered using a 0.2 µm Millipore membrane before being incorporated into the agar.

The E. coli DH10B host strain (Life Technologies, Gibco BRL) is first spread
over a solid M9 medium (composition for 1 litre: 6.0 g Na2HPO4; 3.0 g KH2PO4; 1.0 g
NaCl; 2.0 g glucose; 0.25 g MgSO4·7H2O; 15.0 mg CaCl2·2H2O; 5.0 g Casamino
acids; 40.0 mg L-tryptophan; 1.0 mg thiamine HCl; distilled water q.s.) containing
0.5% (v/v) 1-phenyl-2-propanone (Aldrich ref 13,538-0, CAS 103-79-7) or
1-phenyl-2-propanol (Aldrich ref 14,923-5, CAS 14898-87-4) as the only source of carbon and
energy. The Petri dishes are left to incubate for 18-24 hrs at 30°C. No clones should
appear under these conditions, demonstrating that the E. coli DH10B strain is incapable
of catabolising 1-phenyl-2-propanone or 1-phenyl-2-propanol.
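
By way of illustration only, the per-litre M9 composition given above can be scaled to
any batch volume with simple arithmetic. The Python sketch below is not part of the
described method: the function names (`scale_medium`, `substrate_volume_ml`) and the
dictionary keys are hypothetical, and the quantities are simply those listed in the
preceding paragraph.

```python
# Per-litre composition of the solid M9 medium, as listed in the text.
# Keys and function names are illustrative assumptions, not part of the method.
M9_PER_LITRE = {
    "Na2HPO4 (g)": 6.0,
    "KH2PO4 (g)": 3.0,
    "NaCl (g)": 1.0,
    "glucose (g)": 2.0,
    "MgSO4.7H2O (g)": 0.25,
    "CaCl2.2H2O (mg)": 15.0,
    "Casamino acids (g)": 5.0,
    "L-tryptophan (mg)": 40.0,
    "thiamine HCl (mg)": 1.0,
}

SUBSTRATE_PERCENT_VV = 0.5  # 0.5% (v/v) of substrate or product, as in the text


def scale_medium(volume_l: float) -> dict:
    """Return the amount of each component for a batch of `volume_l` litres."""
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    return {name: amount * volume_l for name, amount in M9_PER_LITRE.items()}


def substrate_volume_ml(volume_l: float,
                        percent_vv: float = SUBSTRATE_PERCENT_VV) -> float:
    """Volume of substrate (mL) giving `percent_vv` % (v/v) in `volume_l` litres."""
    return volume_l * 1000.0 * percent_vv / 100.0


if __name__ == "__main__":
    # Example: a 250 mL batch of medium.
    for name, amount in scale_medium(0.25).items():
        print(f"{name}: {amount:g}")
    print(f"substrate (mL): {substrate_volume_ml(0.25):g}")
```

For one litre of medium, 0.5% (v/v) corresponds to 5 mL of 1-phenyl-2-propanone or
1-phenyl-2-propanol.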


This strain is then transformed with the libraries of environmental DNA prepared and
produced according to the operating conditions described in patents WO 01/81367 and
EP N° 02291871.8. The transformants are deposited on a solid M9 medium to which is
added 10 µg/L of chloramphenicol (cold sterilised, as described above) and which contains
0.5% 1-phenyl-2-propanone or 1-phenyl-2-propanol as the only source of carbon and energy.
They are incubated for 18-24 hrs at 30°C. The clones retained are those which are
capable of growth with 1-phenyl-2-propanone and 1-phenyl-2-propanol as the only source
of carbon and energy. The ability to use these two compounds as the only source
of carbon and energy is linked to the metagenomic DNA carried by the cloning vectors.


The vectors which carry the metagenomic DNAs in question are recovered and then
subjected to random mutagenesis, for example by insertion (EZ::TN transposon kit,
Epicentre Technologies). They are then re-introduced into the E. coli DH10B strain and
the cells transformed in this way are once again deposited onto a solid M9 medium to
which is added 10 µg/L of chloramphenicol and which contains 0.5% 1-phenyl-2-propanone or
1-phenyl-2-propanol as the only source of carbon and energy. They are incubated for
18-24 hrs at 30°C.


The clones retained are those which prove to be incapable of growth with
1-phenyl-2-propanol and 1-phenyl-2-propanone as the only source of carbon. Indeed, the pathway
for the bioconversion of 1-phenyl-2-propanone into 1-phenyl-2-propanol is present in
these clones, and the mutation has deactivated this pathway. A chromatographic
analysis by liquid or gas chromatography makes it possible both to verify this
transformation and to exclude any false positives.
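
Purely as an illustration, the two-stage screen described in this example (retention of
transformants that grow on both compounds, then retention of insertion mutants that no
longer grow) can be expressed as a simple filter over recorded plate observations. The
sketch below is hypothetical: the `GrowthResult` record, its field names and the clone
identifiers are assumptions introduced here, and it reads "incapable of growth with
1-phenyl-2-propanol and 1-phenyl-2-propanone" as loss of growth on both compounds.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class GrowthResult:
    """Plate observation for one clone (hypothetical record layout)."""
    clone_id: str
    grows_on_ketone: bool   # growth on 1-phenyl-2-propanone as sole carbon source
    grows_on_alcohol: bool  # growth on 1-phenyl-2-propanol as sole carbon source


def positive_clones(results: Iterable[GrowthResult]) -> List[str]:
    """Stage 1: keep transformants that grow on both compounds."""
    return [r.clone_id for r in results
            if r.grows_on_ketone and r.grows_on_alcohol]


def inactivated_mutants(results: Iterable[GrowthResult]) -> List[str]:
    """Stage 2: keep insertion mutants that have lost growth on both compounds."""
    return [r.clone_id for r in results
            if not r.grows_on_ketone and not r.grows_on_alcohol]


if __name__ == "__main__":
    # Invented observations, for demonstration only.
    library = [
        GrowthResult("clone-01", True, True),
        GrowthResult("clone-02", False, False),
        GrowthResult("clone-03", True, False),
    ]
    mutants = [
        GrowthResult("clone-01-m1", False, False),
        GrowthResult("clone-01-m2", True, True),
    ]
    print(positive_clones(library))
    print(inactivated_mutants(mutants))
```

Clones retained by the second filter remain candidates only; as stated above, the
chromatographic analysis is still needed to confirm the transformation and exclude
false positives.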

Example 3: Search for the metabolic pathway for the bioconversion of
mandelonitrile into mandelic acid and of mandelonitrile into mandelamide

The microbiological hydrolysis of nitriles into carboxylic acid or into amide
has been described in great detail (Wieser, M. et al (2000), Stereoselective biocatalysis,
Ed. Patel RN, "Stereoselective nitrile-converting enzymes"; Ryuno, K. et al (2003),
Yuki Gosei Kagaku Kyokaishi 61(5), 517-522; Okumura, M. (1991), JETI 39(6), 90-2;
Endo, R., et al (2001), Jpn. Kokai Tokkyo Koho, Application: JP 2000-124591
20000425).

Patent EP-A-0 348 901 describes the preparation of R(-)-mandelic acids by hydrolysis
of racemic mandelonitrile by a preparation either of Alcaligenes faecalis, ATCC 8750
strain, or of Pseudomonas vesicularis, ATCC 11426 strain, or of Candida tropicalis,
ATCC 20311 strain. It proposes producing optically active α-substituted carboxylic
acids from nitriles or from racemic α-substituted amides with the help of certain
microorganisms from the group made up of the Alcaligenes, Pseudomonas,
Rhodopseudomonas, Corynebacterium, Acinetobacter, Bacillus, Mycobacterium and
Rhodococcus genera as well as a yeast, namely Candida.

Patent EP-A-449 648 or US-A-5 296 373 describes a process for producing the
R(-) enantiomer of a substituted mandelic acid from a racemic substituted mandelonitrile by
mixing it with a preparation of a Rhodococcus bacterium, HT 29-7 strain (FERM BP-3857),
which ensures the stereoselective hydrolysis of the nitrile group of the racemate,
apparently in order to avoid the disadvantages of separating the other optically
active substances obtained after hydrolysis by the microorganisms proposed in EP-A-0
348 901. Patent EP-A-0 610 048 proposes using, in a similar reaction, microorganisms
of the Gordona genus, such as Gordona terrae MA-1 (FERM BP-4535).

One of the main problems encountered with these reactions is the appearance of
numerous by-products (aldehydes) and the short half-life of the enzyme due to
the poisoning of the catalyst by the nitriles. The object of this example is to
produce from mandelonitrile either chiral mandelic acid (Reaction B), or
mandelamide (Reaction C) without the problems generally encountered and with strong
specific activity and a high reaction yield.

Reaction B: Bioconversion of mandelonitrile into mandelic acid

Reaction C: Bioconversion of mandelonitrile into mandelamide



The strategy used is identical to that adopted in example 1, simply with substitution of
the starting substrates and the products arrived at. Mandelonitrile (CAS N° 532-28-5),
mandelic acid (CAS N° 90-64-2) and mandelamide (CAS N° 4410-31-5) are dissolved in
acetonitrile, the solutions being prepared immediately before use.

The metabolic pathways catalysing the desired reaction are confirmed by a liquid
chromatography analysis.



